AITopics | naive method

Teaching Old Tokenizers New Words: Efficient Tokenizer Adaptation for Pre-trained Models

Purason, Taido, Chizhov, Pavel, Yamshchikov, Ivan P., Fishel, Mark

arXiv.org Artificial Intelligence, Dec-4-2025

Tokenizer adaptation plays an important role in transferring pre-trained language models to new domains or languages. In this work, we address two complementary aspects of this process: vocabulary extension and pruning. The common approach to extension trains a new tokenizer on domain-specific text and appends the tokens that do not overlap with the existing vocabulary, which often results in many tokens that are unreachable or never used. We propose continued BPE training, which adapts a pre-trained tokenizer by continuing the BPE merge learning process on new data. Experiments across multiple languages and model families show that this approach improves tokenization efficiency and leads to better utilization of added vocabulary. We also introduce leaf-based vocabulary pruning, which removes redundant tokens while preserving model quality. Together, these methods provide practical tools for controlled vocabulary modification, which we release as an open-source package.
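
The idea of continuing BPE merge learning can be illustrated with a minimal sketch. It uses textbook character-level BPE over a word-frequency table and is not the authors' released package; the function names `apply_merges` and `continue_bpe` are hypothetical, and real tokenizers add pre-tokenization, byte-level handling, and special tokens that are omitted here.

```python
from collections import Counter

def apply_merges(symbols, merges):
    """Apply merge rules, in the order they were learned, to a symbol list."""
    for a, b in merges:
        i, out = 0, []
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols

def continue_bpe(word_freqs, merges, num_new_merges):
    """Resume merge learning on new data, starting from existing merges.

    word_freqs: dict mapping words of the new corpus to their counts.
    merges: merge rules of the pre-trained tokenizer (list of pairs).
    """
    merges = list(merges)
    # Segment the new corpus with the existing tokenizer first, so that
    # every new merge is built on tokens the tokenizer can already reach.
    corpus = {w: apply_merges(list(w), merges) for w in word_freqs}
    for _ in range(num_new_merges):
        pairs = Counter()
        for w, syms in corpus.items():
            for pair in zip(syms, syms[1:]):
                pairs[pair] += word_freqs[w]
        if not pairs:
            break
        # Most frequent pair wins; ties are broken lexicographically
        # (an arbitrary choice made here for determinism).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        corpus = {w: apply_merges(syms, [best]) for w, syms in corpus.items()}
    return merges
```

Because each added merge combines two tokens that already occur next to each other in the segmented new data, every added token is reachable by construction. This is the property the abstract contrasts with appending tokens from a separately trained tokenizer.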

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.03989

Country:

Asia > Middle East > UAE (0.46)
Europe > Germany (0.46)
North America > Mexico (0.28)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.30)

ACS: An interactive framework for conformal selection

Gui, Yu, Jin, Ying, Nair, Yash, Ren, Zhimei

arXiv.org Machine Learning, Jul-22-2025

This paper presents adaptive conformal selection (ACS), an interactive framework for model-free selection with guaranteed error control. Building on conformal selection (Jin and Candès, 2023b), ACS generalizes the approach to support human-in-the-loop adaptive data analysis. Under the ACS framework, we can partially reuse the data to boost the selection power, make decisions on the fly while exploring the data, and incorporate new information or preferences as they arise. The key to ACS is a carefully designed principle that controls the information available for decision making, allowing the data analyst to explore the data adaptively while maintaining rigorous control of the false discovery rate (FDR). Based on the ACS framework, we provide concrete selection algorithms for various goals, including model update/selection, diversified selection, and incorporating newly available labeled data. The effectiveness of ACS is demonstrated through extensive numerical simulations and real-data applications in large language model (LLM) deployment and drug discovery.
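
For orientation, the non-interactive baseline that ACS builds on can be sketched as conformal p-values followed by the Benjamini-Hochberg (BH) procedure. This is a generic sketch, not the ACS algorithm: it assumes a score convention in which a smaller test score is stronger evidence for selection, and the function names are hypothetical.

```python
def conformal_pvalues(calib_scores, test_scores):
    """Conformal-style p-value for each test score.

    Assumed convention: a test score that is small relative to the
    calibration scores yields a small p-value.
    """
    n = len(calib_scores)
    return [(1 + sum(c < t for c in calib_scores)) / (n + 1) for t in test_scores]

def benjamini_hochberg(pvals, q):
    """Return sorted indices selected by BH at nominal FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda j: pvals[j])
    k = 0
    # Find the largest rank whose p-value falls under the BH threshold.
    for rank, j in enumerate(order, start=1):
        if pvals[j] <= q * rank / m:
            k = rank
    return sorted(order[:k])

# Usage: 19 calibration scores and 4 test candidates (made-up numbers).
calib = list(range(1, 20))
pvals = conformal_pvalues(calib, [0, 0.5, 10, 25])
selected = benjamini_hochberg(pvals, q=0.25)
```

The interactive features described in the abstract (partial data reuse, on-the-fly decisions, incorporating new labels) are not represented in this sketch.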

large language model, machine learning, selection, (20 more...)

arXiv.org Machine Learning

2507.15825

Country:

North America > United States > Pennsylvania (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Reviews: Robust Principal Component Analysis with Adaptive Neighbors

Neural Information Processing Systems, Jan-24-2025, 18:42:36 GMT

Update: Thanks for the feedback; I have read it, but it has not convinced me to change my decision. For Q2, if the framework is general, the authors should have extended it to more than one case. Otherwise, the authors should focus on PCA instead of claiming the framework is general. For Q3 and Q4, I think the paper's discussion of how to choose k and d is insufficient.

adaptive neighbor, different weight, robust principal component analysis, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.40)

Maximizing the Impact of Deep Learning on Subseasonal-to-Seasonal Climate Forecasting: The Essential Role of Optimization

Guo, Yizhen, Zhou, Tian, Jiang, Wanyi, Wu, Bo, Sun, Liang, Jin, Rong

arXiv.org Artificial Intelligence, Nov-23-2024

Weather and climate forecasting is vital for sectors such as agriculture and disaster management. Although numerical weather prediction (NWP) systems have advanced, forecasting at the subseasonal-to-seasonal (S2S) scale, spanning 2 to 6 weeks, remains challenging due to the chaotic and sparse atmospheric signals at this interval. Even state-of-the-art deep learning models struggle to outperform simple climatology models in this domain. This paper identifies that optimization, instead of network structure, could be the root cause of this performance gap, and then we develop a novel multi-stage optimization strategy to close the gap. Extensive empirical studies demonstrate that our multi-stage optimization approach significantly improves key skill metrics, PCC and TCC, while utilizing the same backbone structure, surpassing the state-of-the-art NWP systems (ECMWF-S2S) by over 19-91%. Our research contests the recent study that direct forecasting outperforms rolling forecasting for S2S tasks. Through theoretical analysis, we propose that the underperformance of rolling forecasting may arise from the accumulation of Jacobian matrix products during training. Our multi-stage framework can be viewed as a form of teacher forcing to address this issue. Code is available at https://anonymous.4open.science/r/Baguan-S2S-23E7/.

artificial intelligence, forecasting, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.16728

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reviews: Learning with Feature Evolvable Streams

Neural Information Processing Systems, Oct-8-2024, 01:15:33 GMT

This paper formalizes a new problem setting, Feature Evolvable Streaming Learning. Sensors and other devices that extract feature values have limited lifespans; these devices are therefore replaced periodically, and the associated feature space changes. This learning paradigm uses an overlapping period to adapt to the new feature space. During the overlapping period, learning algorithms receive features from both the old devices and the new devices simultaneously to capture the relationship between the two feature spaces. The paper develops two learning algorithms that efficiently use previous experience extracted from old training data to train and predict in the new feature space: 1) a weighted-combination-based predictor ensemble method and 2) dynamic classifier selection.

feature evolvable stream, feature space, learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption

Lai, Zhizheng, Zhou, Yufei, Zheng, Peijia, Chen, Lin

arXiv.org Artificial Intelligence, Sep-12-2024

The recently proposed Kolmogorov-Arnold Networks (KANs) offer enhanced interpretability and greater model expressiveness. However, KANs also present challenges related to privacy leakage during inference. Homomorphic encryption (HE) facilitates privacy-preserving inference for deep learning models, enabling resource-limited users to benefit from deep learning services while ensuring data security. Yet, the complex structure of KANs, incorporating nonlinear elements like the SiLU activation function and B-spline functions, renders existing privacy-preserving inference techniques inadequate. To address this issue, we propose an accurate and efficient privacy-preserving inference scheme tailored for KANs. Our approach introduces a task-specific polynomial approximation for the SiLU activation function, dynamically adjusting the approximation range to ensure high accuracy on real-world datasets. Additionally, we develop an efficient method for computing B-spline functions within the HE domain, leveraging techniques such as repeat packing, lazy combination, and comparison functions. We evaluate the effectiveness of our privacy-preserving KAN inference scheme on both symbolic formula evaluation and image classification. The experimental results show that our model achieves accuracy comparable to plaintext KANs across various datasets and outperforms plaintext MLPs. Additionally, on the CIFAR-10 dataset, our inference latency achieves over 7 times speedup compared to the naive method.
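
Homomorphic encryption schemes typically evaluate only additions and multiplications, which is why SiLU must be replaced by a polynomial on a bounded input range. The sketch below shows one generic way to build such a polynomial, Chebyshev interpolation on an interval [a, b]. It is an illustration under stated assumptions, not the paper's task-specific approximation: the interval, the degree, and all function names are hypothetical, and everything runs in plaintext.

```python
import math

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def chebyshev_coeffs(f, a, b, degree):
    """Coefficients of the Chebyshev interpolant of f on [a, b]."""
    n = degree + 1
    # Chebyshev nodes on [-1, 1], mapped to [a, b] before sampling f.
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    vals = [f(0.5 * (b - a) * t + 0.5 * (b + a)) for t in nodes]
    coeffs = []
    for j in range(n):
        s = sum(vals[k] * math.cos(math.pi * j * (k + 0.5) / n) for k in range(n))
        coeffs.append(2.0 * s / n)
    coeffs[0] *= 0.5  # the constant term carries a factor of 1/2
    return coeffs

def eval_chebyshev(coeffs, a, b, x):
    """Evaluate the interpolant with Clenshaw's recurrence.

    Once x is rescaled to [-1, 1], only additions and multiplications
    are used, which is the operation set HE schemes provide.
    """
    t = (2.0 * x - a - b) / (b - a)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]

# Usage: degree-16 approximation on an assumed input range of [-4, 4].
coeffs = chebyshev_coeffs(silu, -4.0, 4.0, 16)
approx = eval_chebyshev(coeffs, -4.0, 4.0, 1.0)
```

The approximation is only trustworthy inside [a, b], so the range has to cover the activations actually seen at inference. The abstract's "dynamically adjusting the approximation range" addresses this; the sketch simply takes the range as an input.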

activation function, dataset, inference, (14 more...)

arXiv.org Artificial Intelligence

2409.07751

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Assisted Path Planning for a UGV-UAV Team Through a Stochastic Network

Bhadoriya, Abhay Singh, Rathinam, Sivakumar, Darbha, Swaroop, Casbeer, David W., Manyam, Satyanarayana G.

arXiv.org Artificial Intelligence, Dec-28-2023

In this article, we consider a multi-agent path planning problem in a stochastic environment. The environment, which can be an urban road network, is represented by a graph where the travel time for selected road segments (impeded edges) is a random variable because of traffic congestion. An unmanned ground vehicle (UGV) wishes to travel from a starting location to a destination while minimizing the arrival time at the destination. The UGV can traverse an impeded edge, but the true travel time is only realized at the end of that edge. This implies that the UGV can potentially get stuck in an impeded edge with high travel time. A support vehicle, such as an unmanned aerial vehicle (UAV), is simultaneously deployed from its starting position to assist the UGV by inspecting and realizing the true cost of impeded edges. With the updated information from the UAV, the UGV can efficiently reroute its path to the destination. The UGV does not wait at any time until it reaches the destination. The UAV is permitted to terminate its path at any vertex. The goal is then to develop an online algorithm to determine efficient paths for the UGV and the UAV based on the current information so that the UGV reaches the destination in minimum time. We refer to this problem as Stochastic Assisted Path Planning (SAPP). We present the Dynamic k-Shortest Path Planning (D*KSPP) algorithm for the UGV planning and a Rural Postman Problem (RPP) formulation for the UAV planning. Due to the scalability challenges of RPP, we also present a heuristic-based Priority Assignment Algorithm (PAA) for the UAV planning. Computational results are presented to corroborate the effectiveness of the proposed algorithm in solving SAPP.

algorithm, uav, ugv, (15 more...)

arXiv.org Artificial Intelligence

2312.1734

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (0.88)
Transportation > Ground > Road (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.86)

Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories

Lan, Li-Cheng, Zhang, Huan, Hsieh, Cho-Jui

arXiv.org Artificial Intelligence, Apr-26-2023

In this paper, we define, evaluate, and improve the "relay-generalization" performance of reinforcement learning (RL) agents on out-of-distribution "controllable" states. Ideally, an RL agent that generally masters a task should reach its goal starting from any controllable state of the environment instead of memorizing a small set of trajectories. For example, a self-driving system should be able to take over control from a human in the middle of driving and must continue to drive the car safely. To practically evaluate this type of generalization, we start the test agent from the middle of the trajectories of other, independently well-trained stranger agents. With extensive experimental evaluation, we show the prevalence of generalization failure on controllable states from stranger agents. For example, in the Humanoid environment, we observed that a well-trained Proximal Policy Optimization (PPO) agent, with only a 3.9% failure rate during regular testing, failed on 81.6% of the states generated by well-trained stranger PPO agents. To improve "relay generalization," we propose a novel method called Self-Trajectory Augmentation (STA), which resets the environment to the agent's old states according to the Q function during training. After applying STA to the Soft Actor Critic (SAC) training procedure, we reduced the failure rate of SAC under relay evaluation by more than three times in most settings, without impacting agent performance or increasing the number of environment interactions needed. Our code is available at https://github.com/lan-lc/STA. Generalization is critical for deploying reinforcement learning (RL) agents into real-world applications. A well-trained RL agent that can achieve high rewards under restricted settings may not be able to handle the enormous state space and complex environment variations in the real world. There are many different aspects regarding the generalization of RL agents.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2304.13424

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Knowledge Distillation-Based Backdoor Attack in Federated Learning

Wang, Yifan, Fan, Wei, Yang, Keke, Alhusaini, Naji, Li, Jing

arXiv.org Artificial Intelligence, Aug-12-2022

Federated Learning (FL) is a novel framework for decentralized machine learning. Because FL is decentralized, it is vulnerable to adversarial attacks during training, e.g., backdoor attacks. A backdoor attack aims to inject a backdoor into the machine learning model such that the model behaves arbitrarily incorrectly on test samples carrying a specific backdoor trigger. Although a range of backdoor attack methods for FL have been introduced, there are also methods for defending against them. Many of these defenses exploit the abnormal characteristics of backdoored models or the difference between backdoored models and regular models. To bypass these defenses, we need to reduce the difference and the abnormal characteristics. We find that one source of such abnormality is that a backdoor attack directly flips the labels of data when poisoning it. However, current studies of backdoor attacks in FL do not mainly focus on reducing the difference between backdoored models and regular models. In this paper, we propose Adversarial Knowledge Distillation (ADVKD), a method that combines knowledge distillation with backdoor attacks in FL. With knowledge distillation, we can reduce the abnormal characteristics of the model that result from label flipping, so that the model can bypass the defenses. Compared to current methods, we show that ADVKD not only reaches a higher attack success rate but also successfully bypasses the defenses when other methods fail. To further explore the performance of ADVKD, we test how its parameters affect performance under different scenarios. Based on the experimental results, we summarize how to adjust the parameters for better performance under different scenarios. We also use several methods to visualize the effects of different attacks and explain the effectiveness of ADVKD.

backdoor attack, model update, participant, (14 more...)

arXiv.org Artificial Intelligence

2208.06176

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Asia > China > Anhui Province > Hefei (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(10 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE

Zhuang, Juntang, Dvornek, Nicha, Li, Xiaoxiao, Tatikonda, Sekhar, Papademetris, Xenophon, Duncan, James

arXiv.org Machine Learning, Jun-3-2020

Neural ordinary differential equations (NODEs) have recently attracted increasing attention; however, their empirical performance on benchmark tasks (e.g., image classification) is significantly inferior to that of discrete-layer models. We demonstrate that an explanation for their poorer performance is the inaccuracy of existing gradient estimation methods: the adjoint method has numerical errors in reverse-mode integration; the naive method directly back-propagates through ODE solvers, but suffers from a redundantly deep computation graph when searching for the optimal stepsize. We propose the Adaptive Checkpoint Adjoint (ACA) method: in automatic differentiation, ACA applies a trajectory checkpoint strategy which records the forward-mode trajectory as the reverse-mode trajectory to guarantee accuracy; ACA deletes redundant components for shallow computation graphs; and ACA supports adaptive solvers. On image classification tasks, compared with the adjoint and naive methods, ACA achieves half the error rate in half the training time; NODE trained with ACA outperforms ResNet in both accuracy and test-retest reliability. On time-series modeling, ACA outperforms competing methods. Finally, in an example of the three-body problem, we show NODE with ACA can incorporate physical knowledge to achieve better accuracy. We provide the PyTorch implementation of ACA: https://github.com/juntang-zhuang/torch-ACA.
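
To make "back-propagating through the ODE solver with a recorded forward trajectory" concrete, here is a toy sketch for the scalar ODE dz/dt = theta * z, solved with fixed-step explicit Euler. It is not the ACA implementation: there is no adaptive step-size search, and the function names and the choice of loss L = z(T) are assumptions made for illustration.

```python
import math

def euler_forward(z0, theta, h, steps):
    """Integrate dz/dt = theta * z with explicit Euler, storing every state."""
    traj = [z0]
    for _ in range(steps):
        z = traj[-1]
        traj.append(z + h * theta * z)
    return traj

def grad_theta_backprop(traj, theta, h):
    """Reverse-mode gradient of L = z_N with respect to theta.

    The stored forward states are reused in the backward sweep, so the
    gradient matches the discrete solver exactly instead of relying on
    a separately integrated reverse-time ODE.
    """
    adj = 1.0   # dL/dz_N for L = z_N
    grad = 0.0
    for k in range(len(traj) - 2, -1, -1):
        # Step: z_{k+1} = z_k + h * theta * z_k
        grad += adj * h * traj[k]      # d z_{k+1} / d theta = h * z_k
        adj *= 1.0 + h * theta         # d z_{k+1} / d z_k = 1 + h * theta
    return grad

# Usage: compare with the closed form N * h * z0 * (1 + h * theta)**(N - 1).
traj = euler_forward(1.0, 0.5, 0.01, 100)
grad = grad_theta_backprop(traj, 0.5, 0.01)
closed_form = 100 * 0.01 * 1.0 * (1.0 + 0.01 * 0.5) ** 99
```

Storing every state costs memory proportional to the number of solver steps. With an adaptive solver, differentiating through rejected step-size trials as well is what produces the "redundantly deep computation graph" attributed to the naive method in the abstract.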

artificial intelligence, machine learning, solver, (15 more...)

arXiv.org Machine Learning

2006.02493

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
